NSF PAR Search | NSF Public Access Repository

Is In-Context Learning a Type of Error-Driven Learning? Evidence from the Inverse Frequency Effect in Structural Priming

https://doi.org/10.18653/v1/2025.naacl-long.586

Zhou, Zhenghao; Frank, Robert; McCoy, R Thomas (April 2025, Association for Computational Linguistics)

Full Text Available
Inductive Bias Is in the Eye of the Beholder

https://doi.org/10.18653/v1/2023.genbench-1.12

Wilson, Michael; Frank, Robert (December 2023, Association for Computational Linguistics)

Full Text Available
Inductive Bias is in the Eye of the Beholder

Wilson, Michael; Frank, Robert (December 2023, Proceedings of the First Workshop on Benchmarking Generalization (GenBench), EMNLP, Association for Computational Linguistics)

Full Text Available
How Abstract Is Linguistic Generalization in Large Language Models? Experiments with Argument Structure

https://doi.org/10.1162/tacl_a_00608

Wilson, Michael; Petty, Jackson; Frank, Robert (November 2023, Transactions of the Association for Computational Linguistics)

Language models are typically evaluated on their success at predicting the distribution of specific words in specific contexts. Yet linguistic knowledge also encodes relationships between contexts, allowing inferences between word distributions. We investigate the degree to which pre-trained transformer-based large language models (LLMs) represent such relationships, focusing on the domain of argument structure. We find that LLMs perform well in generalizing the distribution of a novel noun argument between related contexts that were seen during pre-training (e.g., the active object and passive subject of the verb spray), succeeding by making use of the semantically organized structure of the embedding space for word embeddings. However, LLMs fail at generalizations between related contexts that have not been observed during pre-training, but which instantiate more abstract, but well-attested structural generalizations (e.g., between the active object and passive subject of an arbitrary verb). Instead, in this case, LLMs show a bias to generalize based on linear order. This finding points to a limitation with current models and points to a reason for which their training is data-intensive.
Full Text Available
NPIs Aren’t Exactly Easy: Variation in Licensing across Large Language Models

DeCarlo, Deanna; Palmer, William; Wilson, Michael; Frank, Robert (December 2023, Proceedings of the 6th BlackboxNLP Workshop: Analyzing and Interpreting Neural Networks for NLP, Association for Computational Linguistics)

Full Text Available
How poor is the stimulus? Evaluating hierarchical generalization in neural networks trained on child-directed speech

Yedetore, Aditya; Linzen, Tal; Frank, Robert; McCoy, R Thomas (July 2023, Association for Computational Linguistics)

Full Text Available
What affects Priming Strength? Simulating Structural Priming Effect with PIPS

Zhou, Zhenghao; Frank, Robert (January 2023, Proceedings of the Society for Computation in Linguistics)

Full Text Available
Subject-verb Agreement with Seq2Seq Transformers: Bigger Is Better, but Still Not Best

Wilson, Michael A.; Zhou, Zhenghao; Frank, Robert (January 2023, Proceedings of the Society for Computation in Linguistics)

Full Text Available
How poor is the stimulus? Evaluating hierarchical generalization in neural networks trained on child-directed speech

Yedetore, Aditya; Linzen, Tal; Frank, Robert; McCoy, R. Thomas (January 2023, Proceedings of the conference Association for Computational Linguistics Meeting)

Full Text Available
Coloring the blank slate: Pre-training imparts a hierarchical inductive bias to sequence-to-sequence models

https://doi.org/10.18653/v1/2022.findings-acl.106

Mueller, Aaron; Frank, Robert; Linzen, Tal; Wang, Luheng; Schuster, Sebastian (May 2022, Findings of the Association for Computational Linguistics)

Relations between words are governed by hierarchical structure rather than linear ordering. Sequence-to-sequence (seq2seq) models, despite their success in downstream NLP applications, often fail to generalize in a hierarchy-sensitive manner when performing syntactic transformations, for example transforming declarative sentences into questions. However, syntactic evaluations of seq2seq models have only observed models that were not pre-trained on natural language data before being trained to perform syntactic transformations, in spite of the fact that pre-training has been found to induce hierarchical linguistic generalizations in language models; in other words, the syntactic capabilities of seq2seq models may have been greatly understated. We address this gap using the pre-trained seq2seq models T5 and BART, as well as their multilingual variants mT5 and mBART. We evaluate whether they generalize hierarchically on two transformations in two languages: question formation and passivization in English and German. We find that pre-trained seq2seq models generalize hierarchically when performing syntactic transformations, whereas models trained from scratch on syntactic transformations do not. This result presents evidence for the learnability of hierarchical syntactic information from non-annotated natural language text while also demonstrating that seq2seq models are capable of syntactic generalization, though only after exposure to much more language data than human learners receive.
Full Text Available
